Abstract

Tree convex sets refer to a collection of sets such that each set in the collection is a subtree of a tree whose nodes 
are the elements of these sets. They extend the concept of row convex sets each of which is an interval over a total 
ordering of the elements of those sets. They have been applied to identify tractable Constraint Satisfaction Problems 
and Combinatorial Auction Problems. Recently, polynomial algorithms have been proposed to recognize tree convex 
sets. In this paper, we review the materials that are the key to a linear recognition algorithm. 

1 Introduction 

Given a set U, a collection S of subsets of U is tree convex if there exists a tree T with nodes U such that every set of 
S is a subtree (Zhang and Yap, 2003) of T. Row convex sets are a collection of sets that are tree convex with respect 
to a chain (a special tree with nodes U). Row convex sets correspond to another well studied concept: the consecutive
ones property of matrices. Let M be the matrix whose rows are indexed by the elements of S and columns indexed
by those of U in terms of a total ordering over U. An entry of M, indexed by (s, a) with $s \in S$ and $a \in U$, is one if
and only if $a \in s$. M has the consecutive ones property (Fulkerson and Gross, 1965) with respect to its rows if there is a
total ordering of U such that the ones on each row are consecutive. Clearly, the sets of S are row convex if and only if
the matrix M has the consecutive ones property.
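
The correspondence can be illustrated with a minimal sketch. It only checks the consecutive ones property for one fixed ordering of U (it does not search for an ordering), and the function names are hypothetical, chosen for this illustration.

```python
def membership_matrix(sets, order):
    """Rows follow `sets`; columns follow the total ordering `order` of U."""
    return [[1 if a in s else 0 for a in order] for s in sets]


def has_consecutive_ones(row):
    """True if the ones in `row` form one contiguous block (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def row_convex_under(sets, order):
    """Check row convexity with respect to one given ordering only."""
    return all(has_consecutive_ones(r) for r in membership_matrix(sets, order))


S = [{1, 2}, {2, 3}, {3, 4}]
print(row_convex_under(S, [1, 2, 3, 4]))  # True
print(row_convex_under(S, [1, 3, 2, 4]))  # False: {1, 2} is split by 3
```

Deciding whether some ordering exists is the harder recognition problem addressed by the algorithms cited in this section.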

The property of tree convex and row convex sets has been employed to identify tractable Constraint Satisfaction 
Problems (CSP). CSP problems have found many successful applications in Artificial Intelligence and Combinatorial 
Problems (Dechter, 2003). However, in general, CSP problems are NP-hard. Continuous research effort has been made 
to identify tractable CSP problems. An important approach is to make use of semantic properties of the constraints. 
For monotone constraints, path consistency implies global consistency (Montanari, 1974). van Beek and Dechter 
(1995) generalize monotone constraints to a larger class of row convex constraints which is in turn expanded to tree 
convex constraints by Zhang and Yap (2003). The tractability of these constraints results from the nice intersection 
property of tree convex constraints. 

Recently, tree convex sets also have found applications in combinatorial auctions. Given a set U of items and a 
collection of bids each of which is a subset of U, the problem to decide the winners is NP-complete (Rothkopf et al., 
1998) in general. However, when the collection of bids is tree convex, the problem becomes tractable (Sandholm and
Suri, 2003). (Note that although "tree convexity" is not used in that paper, the concept there is exactly the same as tree 
convexity.) 

An interesting and challenging question raised in the application of tree convex sets in both CSP and Combinatorial 
Auctions is how efficiently one can test the tree convexity of a given collection of sets. There is abundant related 
research work under the umbrella of consecutive ones property test, i.e., row convexity test. The consecutive ones 
problem was first proposed by Fulkerson and Gross (1965). A linear algorithm was then developed by Booth and 
Lueker (1976). It uses quite complex data structures and involved techniques. There exists continuous work, e.g., 
by Meidanis et al. (1998), Habib et al. (2000), and Hsu (2002), to improve the understanding of consecutive ones 
property and its test. For tree convexity test, polynomial algorithms have been recently designed by Yosiphon (2003) 
and Conitzer et al. (2004). Yosiphon makes use of complex data structures and ideas inherited from consecutive ones 
property work. The resulting algorithm is rather involved and has a complexity of $O(mn)$. Conitzer et al. propose a
"simple" algorithm but with a still very high time complexity $O(mn^2)$ where m is the number of sets (bids) and n the
number of all distinct elements in the sets, i.e., the number of all items to bid.

A very interesting question is whether there are linear algorithms for the tree convexity test, as there are for the row convexity test.
In fact, it is listed as one of the open questions in (Conitzer et al., 2004). This question can be answered positively
if we take the collection of sets as a hypergraph. With this perspective, we are not only able to identify a simple and
nice characterization of tree convex sets using hypergraphs and properties of hypergraphs, but also to connect this
problem with the long line of research on conjunctive query evaluation in databases and tree decomposition in Constraint
Satisfaction Problems (Beeri et al., 1983; Dechter and Pearl, 1989; Gottlob and Szeider, 2008). As a result, an existing 
simple and elegant linear algorithm for hypergraphs by Tarjan and Yannakakis (1984) can be directly used to test tree 
convexity. 

Due to a well known example in Constraint Satisfaction Problems where an optimal algorithm AC-4 on enforcing 
arc consistency does not perform better than a non-optimal algorithm AC-3 (Wallace, 1993) in most cases, we also 
carry out experiments on a set of randomly generated problems to compare the linear algorithm with the one in 
(Conitzer et al., 2004). Experimental results show that the former is significantly faster than the latter. 

Section 2 reviews basic concepts and terms including those that might have different meanings in different contexts.
The details of a characterization of tree convex sets and related work are given in Section 3. To make this survey
self-contained, a test algorithm including Tarjan et al.'s algorithm is presented in Section 4. Experimental results are given
in Section 5 before we conclude the paper. 

2 Background 

In this section, we will review the basics of tree convex sets, the related concepts of graphs and hypergraphs, and some 
applications of tree convex sets in Constraint Satisfaction Problems and Combinatorial Auction problems. 

A graph is a tuple (N, E) where N and E are sets, elements of N are called vertices or nodes and those of E 
edges, and each edge is a set of at most two vertices. Hypergraphs generalize graphs by allowing an edge to be a set of
an arbitrary number of vertices. Specifically, a hypergraph H is a pair $(\mathcal{N}, \mathcal{E})$ where $\mathcal{N}$ is a set of vertices, and $\mathcal{E}$ consists
of nonempty subsets of $\mathcal{N}$ that are called hyperedges. Berge's book (1973) is an excellent reference for hypergraphs.

2.1 Notations and results in graphs 

A clique of a graph is a set of pairwise adjacent vertices. A graph is chordal if every cycle of length at least four has 
a chord, i.e., an edge joining two nonconsecutive vertices on the cycle. Forests, trees, chains and (simple) paths are
defined as usual. To reduce potential confusion, we repeat the following definitions. A graph
$(N_1, E_1)$ is a subgraph of $(N, E)$ if $N_1 \subseteq N$ and $E_1 \subseteq E$. Given a tree, a subtree is defined as a connected subgraph
of the tree. A forest on a set S is a forest whose vertex set is exactly S.

2.2 Notations and results in hypergraphs 

We introduce in this section dual hypergraphs, acyclic hypergraphs, join trees and some results on hypergraphs. 
Throughout this paper, we may use "graphs" for "hypergraphs" and "edges" for "hyperedges" when their meaning 
is clear from the context. 

The graph G(H) of a hypergraph H is the graph whose vertices are those of H and whose edges are pairs $\{x, y\}$
such that x and y are in a common edge of H. A hypergraph H is conformal if every clique of G(H) is contained in 
an edge of H. 

The dual graph $H^*$ of a graph $H = (\{v_1, v_2, \ldots, v_n\}, \{S_1, S_2, \ldots, S_m\})$ is a hypergraph
$(\{S_1, S_2, \ldots, S_m\}, \{R_1, R_2, \ldots, R_n\})$ where for $i \in 1..n$, $R_i = \{S_j \mid v_i \in S_j,\ j \in 1..m\}$. The edge $R_i$ is the set of edges of H that
involve vertex $v_i$. Intuitively, one can take $R_i$ as $v_i$.
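
The construction of the dual can be sketched in a few lines. This is only an illustration under assumptions: the edge labels "e1", "e2", "e3" and the function name are hypothetical, and the input is the three-set collection that is discussed again in Section 3.

```python
def dual_hypergraph(vertices, edges):
    """Return (nodes, R) of the dual of a hypergraph.

    `edges` maps an edge label to its vertex set. The nodes of the dual are
    the edge labels; the hyperedge R[v] collects the labels of all edges
    that contain vertex v.
    """
    R = {v: {name for name, e in edges.items() if v in e} for v in vertices}
    return set(edges), R


edges = {
    "e1": {"a", "b", "c"},
    "e2": {"a", "b", "d", "e"},
    "e3": {"b", "c", "d"},
}
nodes, R = dual_hypergraph("abcde", edges)
for v in "abcde":
    print(v, sorted(R[v]))
```

This naive version scans every edge for every vertex; scanning each edge once and appending its label to the entries of its vertices gives the same result in time linear in the total size of the edges.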

The acyclicity of a hypergraph involves a sequence of concepts defined below. H is reduced if no edges of it 
properly contain another edge and every node is in some edge. The reduction of H is H with any contained edges and 
non-edge nodes removed. 

Let $H = (\mathcal{N}, \mathcal{E})$ be a hypergraph with nodes x and y in $\mathcal{N}$. A path from x to y in H is a sequence of edges
$E_1, E_2, \ldots, E_k$ ($k \ge 1$), such that $x \in E_1$, $y \in E_k$ and $E_i \cap E_{i+1} \neq \emptyset$ for $i \in [1..k-1]$. $E_1, E_2, \ldots, E_k$ is also
called a path from $E_1$ to $E_k$.

Two nodes (or edges) are connected if there is a path between them. A set of edges is connected if every pair of
the edges is connected. A connected component of H is a maximal connected set of edges.

Given a hypergraph and a subset of its nodes, we will now define the "projection" of the graph on these nodes. Let
M be a set of nodes of the hypergraph $(\mathcal{N}, \mathcal{E})$. The set of partial edges generated by M is defined to be the reduction
of $\{E \cap M \mid E \in \mathcal{E}\} - \{\emptyset\}$. It is also called a node-generated set of partial edges. Given a set of edges $\mathcal{F}$, we say
(E, F), where $E, F \in \mathcal{F}$, is an articulation pair if $E \cap F$ is an articulation set, i.e., removing $E \cap F$ from every edge
in $\mathcal{F}$ strictly increases the number of connected components of $\mathcal{F}$.

A block of a reduced hypergraph is a connected node-generated set of partial edges without an articulation set. A
reduced hypergraph is acyclic if all its blocks have fewer than two edges. A hypergraph is said to be acyclic if its
reduction is.

As examples, consider the hypergraphs in Figure 1(a) and Figure 1(b). The former is acyclic, following our intuition.
However, the latter is also acyclic. Although a, e, c, a form a "cycle," the graph is acyclic by definition because
the cycle is covered by the edge {a, e, c}.




Figure 1 : Acyclic graphs can be either tree convex or non tree convex. The letters are the vertices and the edges are 
represented by enclosed curves. 

We define join trees below. Given a collection S of sets, $S = \{S_1, S_2, \ldots, S_m\}$, the intersection graph for S,
denoted $I_S$, is the undirected graph (S, E) where $\{S_i, S_j\} \in E$ iff $S_i \cap S_j \neq \emptyset$. A path $S_{i_1}, S_{i_2}, \ldots, S_{i_k}$ of $I_S$ is an
A-path if $A \in S_{i_j} \cap S_{i_{j+1}}$ for all $j \in 1..k-1$. A subgraph $G = (S, E')$ of $I_S$ is a join graph if for every pair of nodes
$S_i$ and $S_j$ of S and every $A \in S_i \cap S_j$, there is an A-path from $S_i$ to $S_j$ in G. A join tree is a join graph that is a tree.
A hypergraph $(\mathcal{N}, \mathcal{E})$ has a join tree if there is a join tree for $\mathcal{E}$. Acyclic graphs and join trees are closely related as
revealed by the following result.

Theorem 1 ((Beeri et al., 1983)). The following statements on hypergraph H are equivalent: 

• H is acyclic. 

• H has a join tree. 

• H is conformal, and G(H) is chordal.
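
Theorem 1 characterizes acyclicity but does not by itself give a procedure. As a point of comparison only, the sketch below tests acyclicity with the classical GYO (Graham, Yu and Özsoyoğlu) reduction, which is a different and simpler technique than the linear algorithm presented in Section 4. The function name is hypothetical and the implementation is deliberately naive (not linear).

```python
def is_acyclic_gyo(edges):
    """GYO reduction: a hypergraph is acyclic iff its edges reduce to nothing."""
    edges = [set(e) for e in edges if e]
    changed = True
    while changed:
        changed = False
        # Rule 1: delete every vertex that occurs in exactly one edge.
        count = {}
        for e in edges:
            for v in e:
                count[v] = count.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if count[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: delete one edge that is empty or contained in another edge.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


# The hypergraph of Figure 1(b) is acyclic; a plain triangle is not.
print(is_acyclic_gyo([{"a", "e", "f"}, {"c", "d", "e"}, {"a", "b", "c"}, {"a", "c", "e"}]))
print(is_acyclic_gyo([{"a", "b"}, {"b", "c"}, {"a", "c"}]))
```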

2.3 Tree convex sets 

A collection of sets $S_1, S_2, \ldots, S_m$ is tree convex with respect to a forest T on $\bigcup_{i \in 1..m} S_i$ if every $S_i$ is a subtree of T.
For example, the sets {a, b, c}, {a, b, d}, and {a, c, d} are tree convex with respect to the tree with vertices {a, b, c, d} 
and edges {{a, b}, {a, c}, {a, d}}. 

2.4 Tree convex constraints and problems 

A binary constraint network consists of a set of variables $V = \{x_1, x_2, \ldots, x_n\}$ with a finite domain $D_i$ for each
variable $x_i \in V$, and a set of binary constraints C over the variables of V. $c_{xy}$ denotes a constraint on variables x and
y which is defined as a relation over $D_x$ and $D_y$. Operations on relations, e.g., intersection ($\cap$), composition ($\circ$), and
inverse, are applicable to constraints. Arc and path consistency are defined as in (Mackworth, 1977), and global
consistency (k-consistency) as in (Freuder, 1978).

Given a constraint $c_{xy}$, the image of a value a of x is the set of values of y that are compatible with a under $c_{xy}$.
A constraint $c_{xy}$ is tree convex with respect to a forest T on $D_y$ if the images of all values of $D_x$ are tree convex with
respect to T. A constraint network is tree convex if there exists a forest on the domain of each variable such that every
constraint $c_{xy}$ of the network is tree convex with respect to the forest on $D_y$.
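
As a small illustration of the definition of images, the sketch below assumes that a constraint is stored as a set of compatible value pairs; this representation, the function name and the sample values are assumptions made for the example.

```python
def images(relation, domain_x):
    """Map each value a of x to its image {b : (a, b) is allowed by c_xy}."""
    img = {a: set() for a in domain_x}
    for a, b in relation:
        img[a].add(b)
    return img


c_xy = {(1, "p"), (1, "q"), (2, "q"), (3, "r")}
img = images(c_xy, [1, 2, 3])
print(img[1], img[2], img[3])
```

The collection of images, here {p, q}, {q} and {r}, is the collection of sets whose tree convexity with respect to a forest on $D_y$ has to be tested.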

If a tree convex constraint network is arc and path consistent, it is globally consistent (Zhang and Yap, 2003), which
implies that a solution can be found in polynomial time. 

2.5 Combinatorial auction problems 

Emerging as key mechanisms for allocating goods, tasks, resources etc., combinatorial auctions (Cramton et al., 2006)
allow the bidders to bid on bundles of items, instead of single items. The problem to determine the winners in
combinatorial auctions is NP-complete (Rothkopf et al., 1998). However, restricted classes of combinatorial auction problems
have been identified. For those classes, there exist efficient polynomial algorithms. We are particularly interested in
the class of problems where an item graph of the bids is a tree (Conitzer et al., 2004). 

Every bid is a set of items. Given a combinatorial auction clearing problem instance (i.e., a set of bids), the graph
G = (I, E), where I corresponds to the items in the instance, is a (valid) item graph if for every bid, the set of items
in that bid constitutes a connected subgraph of G. G is an item tree if it is a tree.

It is straightforward to verify, by the definitions, that a set of bids is tree convex iff there is an item tree for the bids. 

Conitzer et al. proposed an algorithm to recognize tree convexity with complexity of $O(mn^2)$ where m is the total
number of bids and n the total number of items in the auction. Given a collection of bids $S = \{S_1, S_2, \ldots, S_m\}$, the
algorithm first constructs a graph with vertices $\bigcup S$ $(= S_1 \cup S_2 \cup \cdots \cup S_m)$ and weighted edges
$G = \{(\{a, b\}, \mathit{weight}) \mid \exists s \in S \text{ such that } a, b \in s, \text{ and } \mathit{weight} = |\{s \in S : a, b \in s\}|\}$. It next finds the maximum spanning tree T of G.

The sets of S are tree convex iff the sets are tree convex with respect to T (Conitzer et al., 2004). 
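
The first step of this approach can be sketched as follows. The source only states that a maximum spanning tree is computed; the use of Kruskal's algorithm with a union-find structure, the function name, and the requirement that items be sortable are assumptions of this sketch, which also makes no attempt to meet any particular complexity bound.

```python
from itertools import combinations


def max_spanning_forest(sets):
    """Build co-occurrence weights and return the edges of a maximum spanning forest."""
    weight = {}
    for s in sets:
        for a, b in combinations(sorted(s), 2):
            weight[(a, b)] = weight.get((a, b), 0) + 1

    parent = {x: x for s in sets for x in s}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Kruskal: consider the heaviest edges first, skip those closing a cycle.
    for (a, b), _ in sorted(weight.items(), key=lambda kv: -kv[1]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b))
    return tree


bids = [{"a", "b", "c"}, {"a", "b", "d", "e"}, {"b", "c", "d"}]
tree = max_spanning_forest(bids)
print(tree)
```

For these three bids, the pairs {a, b}, {b, c} and {b, d} each occur in two bids, so they are picked first; the remaining item e is then attached by one weight-1 edge. The second step, checking every bid against the resulting tree, is the subject of Algorithm 3 in Section 5.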

3 Characterization of tree convex sets 

Given a collection of sets $S = \{S_1, S_2, \ldots, S_m\}$, let $U(S) = \bigcup_{s \in S} s$. The hypergraph of S is $(U(S), S)$. The dual
hypergraph of S is the dual graph of $(U(S), S)$.

To identify whether S is tree convex, one convenient way is to look at the hypergraph of S. Consider the example
{{1, 3}, {1, 5}, {1, 9}} in Figure 1(a). Clearly, its hypergraph is acyclic and suggests a tree with respect to which the 
collection is tree convex. However, we have the following observations about the relationship between a collection of 
sets and the acyclicity of their hypergraphs. 

That the hypergraph of S is acyclic does not necessarily imply the tree convexity of S. In other words, the hypergraph of a non
tree convex collection could be acyclic. Consider the collection $S = \{\{a, e, f\}, \{c, d, e\}, \{a, b, c\}, \{a, c, e\}\}$ in Figure 1(b).
As mentioned before, S is acyclic. However, it is not tree convex. Assume otherwise that it is tree convex with respect to
a tree T. There are paths on T: $P_1: a \to c$ (because a, b, and c form a subtree of T), $P_2: c \to e$, $P_3: e \to a$.
Clearly, $P_1 P_2 P_3$ forms a cycle, a contradiction to the fact that T has no cycles. Another observation is that not all
tree convex sets form an acyclic hypergraph. The example $S = \{\{a, b, c\}, \{a, b, d, e\}, \{b, c, d\}\}$ (Figure 2(a)), given
by Yosiphon$^1$, is tree convex but not acyclic. Each set of S is a subtree of the tree shown in the figure. From the
intersection graph of S in Figure 2(b), there does not exist a join tree for S. So, S is not acyclic.

In fact, the tree convexity of a collection is related to the acyclicity of its dual graph. 

Theorem 2. A collection S of sets is tree convex iff its dual hypergraph is acyclic. 

Proof. Given a collection S of sets, let $H = (\{v_1, v_2, \ldots, v_n\}, S)$ be its hypergraph. Here we take U(S) as
$\{v_1, v_2, \ldots, v_n\}$. Let $D = (S, \{R_1, R_2, \ldots, R_n\})$ be the dual graph of S.

Necessary condition. Let T be a tree on U(S) such that S is tree convex with respect to it. The idea is to construct
a join tree for D so that D is acyclic by Theorem 1. We now construct a tree $T' = (V, E)$ where $V = \{R_1, R_2, \ldots, R_n\}$.
For all $R_i, R_j \in V$, $\{R_i, R_j\} \in E$ if and only if $\{v_i, v_j\}$ is an edge of T. We next show that $T'$ is a join tree for
D. Consider any two vertices $R_i$ and $R_j$ such that $R_i \cap R_j \neq \emptyset$ and any $f \in R_i \cap R_j$ (note f is an edge of H).
By definition of dual graph, $v_i, v_j \in f$ because $f \in R_i \cap R_j$ and $R_i$ and $R_j$ consist of edges involving $v_i$ and $v_j$
respectively. There is a unique path from $v_i$ to $v_j$ in T. Let it be $P = v_i, \ldots, v_j$. That S is tree convex implies that f
is a subtree of T. Since both $v_i$ and $v_j$ belong to f, all vertices on P are in f. Corresponding to P, there is a path

1 Personal communication 2004. 

Figure 2: Tree convex sets might not be acyclic. (a) Straight lines represent edges of the underlying tree on the
vertices. (b) Enclosed curves represent nodes which correspond to edges in (a). Letters on the straight edges represent
the intersection of the nodes at their ends.

$P' = R_i, R_{i+1}, \ldots, R_j$ in $T'$ by the construction of $T'$. For all $k \in i..j$, since $v_k \in f$, we have $f \in R_k$. Hence, $P'$ is
an f-path from $R_i$ to $R_j$. Therefore, $T'$ is a join tree of D.

Sufficient condition. Since the dual graph of S is acyclic, there is a join tree $T' = (\{R_1, R_2, \ldots, R_n\}, \mathcal{R})$ for D
by Theorem 1. We will show that there is a tree T under which S is tree convex. Construct $T = (\{v_1, v_2, \ldots, v_n\}, E)$
where $\{v_i, v_j\} \in E$ if and only if $\{R_i, R_j\} \in \mathcal{R}$. Clearly, T is a tree. We next prove that for any $s \in S$, s is a subtree
of T. Specifically, we show that for any two vertices $v_i$ and $v_j$ of the edge s, there exists a path from $v_i$ to $v_j$ in T and
the nodes on the path are in s. By definition of dual graphs, $s \in R_i$ and $s \in R_j$ because $v_i, v_j \in s$. Since $T'$ is a join
tree of D, there is an s-path $R_i, R_{i+1}, \ldots, R_j$ from $R_i$ to $R_j$ in $T'$. By the construction of T, $v_i, v_{i+1}, \ldots, v_j$ is a
path of T. For all $k \in i..j$, since $s \in R_k$, we have $v_k \in s$. Hence, s is a subtree of T and thus S is tree convex. □

To illustrate the concepts used in the proof, consider the collection $S = \{\{a, b, c\}, \{a, b, d, e\}, \{b, c, d\}\}$ again. Let
$e_1 = \{a, b, c\}$, $e_2 = \{a, b, d, e\}$, and $e_3 = \{b, c, d\}$. The hypergraph of S is $H = (\{a, b, c, d, e\}, \{e_1, e_2, e_3\})$
(Figure 2(a)). The dual graph of S is $D = (\{e_1, e_2, e_3\}, \{R_a, R_b, R_c, R_d, R_e\})$ (Figure 3(a)) where $R_a = \{e_1, e_2\}$,
$R_b = \{e_1, e_2, e_3\}$, $R_c = \{e_1, e_3\}$, $R_d = \{e_2, e_3\}$, $R_e = \{e_2\}$. Since $R_e$ is a subset of $R_d$ and other edges are subsets of $R_b$,
we have a join tree shown in Figure 3(b). So, D is acyclic. From the join tree, we can construct a tree on the nodes of
the original sets as in Figure 3(c). S is tree convex with respect to the tree.



Figure 3: (a) The dual graph of S. Every edge has a label of R with subscript. (b) A join tree. (c) Tree derived from
(b). The nodes are the elements in the original sets.

A result similar to Theorem 2 was discovered long ago by Goodman and Shmueli (1983) in the study of
database schemas. They provided a rather comprehensive characterization of acyclic hypergraphs. One of their main 
results is the relationship between acyclic hypergraph and chordality and conformality which is well known by the 
constraint community (Beeri et al., 1983; Dechter, 2003). However, another result is not known well but directly 
related to the characterization of tree convexity. It is worth reviewing the result here. First, we introduce some of 
their terms that are not well known in the constraint community. In the case that confusion could arise from the use of 
common terminologies, we underline the terms. 

Given a hypergraph $H = (\mathcal{N}, \mathcal{E})$, a dual graph for H (Goodman and Shmueli, 1983) is a graph $G = (V_{\mathcal{E}}, F)$
equipped with a one-to-one onto map from $V_{\mathcal{E}}$ to $\mathcal{E}$ indicating which node of G represents which edge of $\mathcal{E}$. Note that G is not
a hypergraph here, but just a graph. One type of dual graph used by Goodman and Shmueli is an intersection graph,
denoted by $\Omega(H)$. $\Omega(H) = (V_{\mathcal{E}}, F)$ such that $\{x, y\} \in F$ iff $E_x \cap E_y \neq \emptyset$ where $E_x$ and $E_y$ are the edges (of H)
represented by x and y respectively. A second type of dual graph is a qual graph (Bernstein and Goodman, 1981).
Given $u \in \mathcal{N}$, the dual of u is $u^* = \{E \in \mathcal{E} \mid u \in E\}$. A qual graph for H is any dual graph $G = (V_{\mathcal{E}}, F)$ such that
for each $u \in \mathcal{N}$, the subgraph of G induced by nodes representing elements of $u^*$ is connected. One can verify that
the graph of Figure 3(c) is a qual graph of the hypergraph of Figure 3(a). The nodes a to e of Figure 3(c) represent
edges $R_a$ to $R_e$. As an example, consider node $e_3$. Its dual is $e_3^* = \{R_c, R_b, R_d\}$. The subgraph of Figure 3(c) induced
by b, c, d (representing the elements of $e_3^*$) is connected.

A database schema can be thought of as a hypergraph whose nodes are the schema's attributes and whose edges 
are the schema's relations. A hypergraph H is a tree schema if some qual graph for it is a tree. 

Now we are ready to present Goodman and Shmueli's result (Goodman and Shmueli, 1983, Theorem 6). 

Theorem 3 (Goodman and Shmueli 1983). A hypergraph H is a tree schema iff H is acyclic. 

Theorems 2 and 3 are equivalent. First, one can show that if a collection of sets is tree convex with respect to a
forest, it is tree convex with respect to a tree, and vice versa. Next, by Theorem 2, hypergraph H is acyclic iff the
collection of the edges of its dual graph, $H^*$, is tree convex. Thirdly, a key observation is that the collection of edges
of $H^*$ is tree convex iff some qual graph for H is a tree. By the definition of tree convexity, the former condition holds
iff there exists a tree T with nodes of $H^*$ such that every edge of $H^*$ is a subtree of T. Clearly, by the definition of
qual graph, T is a qual graph for H. Finally, by definition of tree schema, H is a tree schema iff some qual
graph for H is a tree.

Recently, a nice and more general result on hypergraphs was discovered by Gottlob and Greco (Gottlob and Greco, 
2007). 

Theorem 4 (Gottlob and Greco 2007). Let k be a number and $H = (\mathcal{N}, \mathcal{E})$ a hypergraph such that for each node
$v \in \mathcal{N}$, $\{v\} \in \mathcal{E}$. Then, a k-width tree decomposition of an item graph for H exists if and only if $H^*$ has a (k+1)-width
strict hypertree decomposition.

Essentially, the hypergraph H is a set of bids (i.e., a collection of sets). A detailed explanation of the concepts
of k-width tree decomposition of a graph and (k+1)-width (strict) hypertree decomposition of a hypergraph can be
found in (Gottlob and Greco, 2007). This result relates a more general property of a hypergraph with some property
of its dual. A 1-width tree decomposition of an item graph for H exists if and only if an item graph for H is a tree,
i.e., H is tree convex. By definition of strict hypertree decomposition, one can show that a hypergraph has a 2-width
strict hypertree decomposition if and only if it is acyclic. So, Theorem 4 implies Theorem 2 and thus Theorem 3.

Remark. Given a hypergraph H (representing the topological structure of a CSP problem), its dual (constraint) graph
is defined as the intersection graph for H in (Dechter, 2003). Clearly, the dual graph of Section 2.2 is different from both
the dual graph of Goodman and Shmueli and the dual (constraint) graph. Goodman and Shmueli's definition of intersection
graph agrees with that of the intersection graph in Section 2.2. As for the definitions
of acyclic graphs, we follow those in (Beeri et al., 1983). Acyclic hypergraphs are called hypertrees in (Dechter,
2003), but $\alpha$-acyclic graphs in (Fagin, 1983) where other types of acyclicity are also introduced.

4 Algorithms to identify tree convexity 

By Theorem 2, we have the following algorithm to test the tree convexity of a given collection S and produce a tree if 
the given collection is tree convex. 



Algorithm 1: Recognize tree convexity of sets

isTreeConvex (in S)
1   Let D be the dual graph of S
2   if isAcyclic(D, R, $\gamma$) then
3       genForest(D, R, $\gamma$, T)
4       return (true, T)
    else
5       return false



The algorithm first constructs the dual graph D of S. The function isAcyclic(D, R, $\gamma$) returns true and data
structures R and $\gamma$ (discussed below) if D is acyclic, and it returns false otherwise. In the former case,
genForest(D, R, $\gamma$, T) uses R and $\gamma$ to build a tree T with respect to which S is tree convex.

Based on the work by Rose et al. (1976), Tarjan and Yannakakis (1984) proposed a simple linear algorithm
(maximum cardinality search) to identify whether a hypergraph is acyclic. Although maximum cardinality search on a
graph can be easily found in a wide range of references (Dechter, 2003), very few references involve the search over
hypergraphs. We include it here to make our presentation complete, with the correction of some errors in the original
presentation.

Given a hypergraph $(\mathcal{N}, \mathcal{E})$, the key behind this algorithm is to compute three mappings $\alpha$, $\beta$, and $\gamma$. A mapping is a
(possibly partial) function that assigns a node and/or an edge to a number between 1 and $|\mathcal{N}|$ inclusive. Specifically,
the domain of $\alpha$ is $\mathcal{N}$, that of $\beta$ is $\mathcal{N}$ and $\mathcal{E}$, and that of $\gamma$ is $\mathcal{E}$. The algorithm, called restricted maximum cardinality
search on hypergraph, works as follows. It first selects an edge s from $\mathcal{E}$ arbitrarily. Mapping $\alpha$ assigns the nodes of s
the numbers from n to $n - |s| + 1$ one by one. An edge is exhausted if all of its nodes have been assigned a number by
$\alpha$, and nonexhausted otherwise. Next we select a nonexhausted edge t with the maximum number of nodes assigned
by $\alpha$ (ties are broken arbitrarily). Let $n_1$ be the largest number that is smaller than $|\mathcal{N}|$ but not used by $\alpha$ yet. Assign
the non-assigned nodes of t the numbers $n_1, n_1 - 1, \ldots$, one number per node. Repeat this process until every node of the graph
is assigned a number by $\alpha$. R(i) is used to remember the i-th selected edge. The mapping $\beta$ is defined as follows. If
s is the i-th selected edge, $\beta(s) = i$. Otherwise, it is not defined. For a node v, $\beta(v)$ is defined as $\beta(s)$ where s is
the first selected edge such that $v \in s$, i.e., $\beta(v) = \min\{\beta(s) \mid s \text{ is selected and } v \in s\}$. (Note that in line 12 of the
algorithm, $\beta(E) \leftarrow k$ is redundant. We keep it there to make it compatible with the original algorithm. It also makes
the definition of $\beta$ clearer.) For each edge s, if s is not selected during the process, $\gamma(s)$ is $\beta(v)$ where $v \in s$ is the
last one to be assigned a number by $\alpha$, i.e., $\gamma(s) = \max\{\beta(v) \mid v \in s\}$; if s is selected by the process, $\gamma(s)$ is $\beta(v)$ where
$v \in s$ is the last node assigned by $\alpha$ strictly before s is selected, i.e., $\gamma(s) = \max\{\beta(v) \mid v \in s \text{ and } \beta(v) < \beta(s)\}$. In
the last case, if $\beta(v) = \beta(s)$ for all $v \in s$, $\gamma(s)$ is not defined.

The mappings are then employed to test the acyclicity of a graph. Given a hypergraph H, assume that k edges in total
are selected during the process above. H is acyclic iff for each $i \in 1..k$ and each edge s such that $\gamma(s) = i$,
$s \cap \{v \mid \beta(v) < i\} \subseteq R(i)$. The code from line 26 to 32 implements this test.

To compute the mappings in linear time, data structures set(i), size(s) and j are maintained during the process
of building $\alpha$. For each s, size(s) is the count of assigned vertices in s if s is nonexhausted and $-1$ otherwise. For
$i \in 0..n-1$, set(i) is the set of nonexhausted edges that have exactly i vertices assigned by $\alpha$. Index j is the maximum
i such that set(i) is nonempty.

The algorithms to test acyclicity and generate the forest are of linear time complexity (Tarjan and Yannakakis, 
1984). Hence, we have the following result. 

Theorem 5. The worst case time complexity of the algorithm to identify the tree convexity of a collection of sets is 
linear in the problem size. 

Given a collection of sets $S = \{S_1, S_2, \ldots, S_m\}$, the size of the problem is $\sum_{i=1}^{m} |S_i|$. The complexity of the
acyclicity based algorithm is linear in the problem size. Conitzer et al.'s algorithm has a complexity of $O(mn^2)$ where
$n = |\bigcup S|$. Note that the size of each set (bid) may range from 1 to n, but never exceeds n. So, the difference in the
worst case complexity of the two algorithms is clear.
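
The gap can be spelled out as follows. The values m = 1000 and n = 100 are purely illustrative and are not taken from the experiments reported later.

```latex
\[
  \sum_{i=1}^{m} |S_i| \;\le\; m \cdot n ,
  \qquad\text{whereas the bound } O(mn^2) \text{ grows with } m \cdot n^2 .
\]
\[
  m = 1000,\ n = 100:\qquad
  \sum_{i=1}^{m} |S_i| \;\le\; 10^{5},
  \qquad
  m\,n^{2} = 10^{7}.
\]
```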

Algorithm 2 differs from that of (Tarjan and Yannakakis, 1984) in the following two parts. 1) Line 14 was i++
in the original paper, which was clearly a typo. 2) Instead of having lines 22-23, the original algorithm increases j by
one right before line 25, which is not correct. Our newly added code in lines 22-23 preserves the linear complexity
of the algorithm. In the complexity analysis, line 25 is the key. The number of executions of line 25 during the whole
process can be taken as a combination of two parts: executions caused by the monotonic decrease of j, and those extra
executions d caused by the increase of j in lines 22-23. d is n in the worst case, as every node of $U(\mathcal{E})$ will be selected
once and only once and for each selected node d will be increased by only one in the worst case. The new change
follows the amortization spirit used in the original analysis. Therefore, Algorithm 2 still has linear complexity.

In the following comment, we use the notations of, and refer to, the original algorithm (page 573) in (Tarjan and
Yannakakis, 1984). In a personal communication, Yannakakis and Tarjan point out two alternatives to correct the
original algorithm. The first is to replace j := j + 1 by j := |R(k)|. The other way is to move j := j + 1 to the line
immediately before the inner for loop, i.e., line 15, where i is updated.
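Algorithm 2: Acyclicity test and generation of the forest

isAcyclic (in $\mathcal{E}$, out R, $\gamma$)
1   Let n be the number of nodes in $U(\mathcal{E})$
2   for each $i \in 0..n-1$ do
3       $set(i) \leftarrow \emptyset$
4   for $F \in \mathcal{E}$ do
5       $size(F) \leftarrow 0$
6       $\gamma(F) \leftarrow$ undefined
7       add F to set(0)
8   $i \leftarrow n + 1$, $j \leftarrow 0$, $k \leftarrow 0$
9   while $j \ge 0$ do
10      delete any E from set(j)
11      k++
12      $\beta(E) \leftarrow k$, $R(k) \leftarrow E$, $size(E) \leftarrow -1$
13      for $v \in E$ such that v is not assigned do
14          i--
15          $\alpha(v) \leftarrow i$, $\beta(v) \leftarrow k$
16          for $F \in \mathcal{E}$ such that $v \in F$ and $size(F) \ge 0$ do
17              $\gamma(F) \leftarrow k$
18              delete F from set(size(F))
19              size(F)++
20              if $size(F) < |F|$ then
21                  add F to set(size(F))
22                  if $j < size(F)$ then
23                      $j \leftarrow size(F)$
                else
24                  $size(F) \leftarrow -1$
25      while $j \ge 0$ and $set(j) = \emptyset$ do j--
26  for $v \in U(\mathcal{E})$ do $index(v) \leftarrow 0$
27  for each $i \in 1..k$ do
28      for $v \in R(i)$ do $index(v) \leftarrow i$
29      for each $F \in \mathcal{E}$ such that $\gamma(F) = i$ do
30          for $v \in F$ do
31              if $\beta(v) < i$ and $index(v) < i$ then
                    return false
32  return true

genForest (in $\mathcal{E}$, R, $\gamma$, out T)
33  $V \leftarrow \mathcal{E}$
34  $F \leftarrow \{\{E, R(\gamma(E))\} \mid E \in \mathcal{E} \text{ and } \gamma(E) \text{ is defined}\}$
35  $T \leftarrow (V, F)$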

5 Experimental evaluation 

We have carried out an experimental evaluation of the performance of the acyclicity based algorithm and the spanning 
tree based algorithm (Conitzer et al., 2004). The algorithm in (Conitzer et al., 2004) consists of two parts: the first
part is to find a tree over the items (see the background section) and the second part is to test whether every set (bid)
is a subtree of the constructed tree. Due to space limitation, no concrete algorithm for the second part is provided in
(Conitzer et al., 2004). However, it is mentioned in (Conitzer et al., 2004) that the omitted algorithm is achievable in
$O(mn)$ where m is the number of sets (bids), and n the number of elements (items). To make this paper complete and
the experiments here reproducible, we include an algorithm for the second part. The idea is to get the subgraph of the
tree induced by each set (lines 1-4) and then check the connectedness of each induced graph (lines 5-6).



Algorithm 3: Identify tree convex sets with respect to a given tree

treeTest (in S, T)
1   for each $s \in S$ do construct graph $G_s = (s, \emptyset)$
2   for each edge $\{a, b\}$ of T do
3       for each $s \in S$ do
4           if $\{a, b\} \subseteq s$ then
                add edge $\{a, b\}$ to graph $G_s$
5   for each graph $G_s$ do
6       if the connected component of $G_s$ is not equal to s then
            return false
7   return true



For line 6, the connected components of a graph can be identified in linear time (Cormen et al., 1990). The
complexity of the algorithm is O(mn) due to the two nested loops: line 2 iterates over the n − 1 edges of T and
line 3 over the m sets.
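A minimal Python sketch of Algorithm 3 follows. The function name tree_test, the representation of T as a list of edge pairs, and the treatment of an empty set as trivially convex are our assumptions, not details given in the paper.

```python
def tree_test(sets, tree_edges):
    """Return True if every set induces a connected subgraph of the tree."""
    sets = [set(s) for s in sets]
    # Lines 1-4: build the adjacency lists of the subgraph G_s induced by each s.
    induced = [{a: [] for a in s} for s in sets]
    for a, b in tree_edges:
        for s, g in zip(sets, induced):
            if a in s and b in s:
                g[a].append(b)
                g[b].append(a)
    # Lines 5-6: G_s must form a single connected component covering s.
    for s, g in zip(sets, induced):
        if not s:
            continue  # assumption: the empty set is treated as trivially convex
        start = next(iter(s))
        seen = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for w in g[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(seen) != len(s):
            return False
    return True
```

With the path 1-2-3 given as [(1, 2), (2, 3)], the collection [{1, 2}, {2, 3}] passes, whereas [{1, 3}] fails because nodes 1 and 3 are not adjacent in the tree.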

Recall that a collection of sets, i.e., a set of bids, is tree convex iff there is an item tree for the bids. So the algorithm
in (Conitzer et al., 2004) is directly applicable to the tree convexity test, and no modification or reconstruction is
necessary. Our implementation is faithful to the algorithm given in (Conitzer et al., 2004). The experiments are
carried out on an AMD Opteron 2350 CPU (frequency 2.0 GHz) running Ubuntu Linux 9.04 (kernel 2.6.28-11). The
algorithms are implemented in Python 2.6.2.

Our implementation experience leads to the following comments on the simplicity of the algorithms. Both algorithms
are conceptually quite simple. However, the pseudo code and data structures of the acyclicity based algorithm can be
implemented "directly", whereas for the spanning tree based algorithm we have to choose the graph data structures
carefully so that all the complexity results hold. Its final implementation is much longer and more complex than that
of the acyclicity based algorithm.

The acyclicity based and spanning tree based algorithms are evaluated on random problems (generated by ourselves) and
on the structured problems provided by Leyton-Brown et al. (2000).

5.1 Random problems 

Four parameters are employed to generate our own collections of sets: (m, n, r1, r2) where m denotes the number of
sets in the collection to generate, the size of each set is between r1 and r2, and each set takes values from 1 to n.
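A generator for such collections could look like the sketch below. The paper does not state the sampling distribution, so drawing the size uniformly from r1..r2 and the elements uniformly without replacement are assumptions, as are the function name random_collection and the seed parameter.

```python
import random


def random_collection(m, n, r1, r2, seed=None):
    """Generate m random subsets of {1, ..., n}, each of size between r1 and r2.

    Assumes r1 <= r2 <= n; sizes and elements are drawn uniformly (our choice).
    """
    rng = random.Random(seed)
    universe = range(1, n + 1)
    return [set(rng.sample(universe, rng.randint(r1, r2))) for _ in range(m)]
```

Passing a fixed seed makes a run repeatable, for example random_collection(100, 100, 2, 10, seed=0) for the first configuration in Table 1.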

The evaluation is designed as follows. Since the acyclicity based algorithm is theoretically faster than the spanning
tree based algorithm, it should also be faster in practice on large problems. We sample a few problems
with large configuration parameters to show how large the difference between the two algorithms can be. In Table 1,
where each time is for 10 problem instances, the acyclicity based algorithm is one to two orders of magnitude faster
than the spanning tree based algorithm. As the problem size grows, the cost of the spanning tree based algorithm
grows much faster than that of the acyclicity based algorithm.



m      n      r1    r2    Acyclicity based    Spanning tree based
100    100    2     10    0.05                1.03
300    300    2     30    0.21                15.99
500    500    2     50    0.56                69.40

Table 1: Performance for large parameters

For small problems, theoretical time complexity might not fully agree with practical performance. Therefore,
we employ a systematic comparison scheme: vary the values of m and r2 in turn with the other parameters fixed.
Specifically, we have tested the configurations <m, 100, 2, r2> where m changes from 10 to 200 with a
step of 10, and r2 changes from 20 to 90 with a step of 10. For each configuration of the parameters, 100 instances
are generated. Samples of the results are shown in Figure 4 and Figure 5.

Figure 4: Performance of the algorithms on problems <m, 100, 2, 30> with m changing from 10 to 200 with a step of 10.



From the results, the acyclicity algorithm runs significantly faster than the spanning tree based algorithm. 

5.2 Existing structured problems 

The problems (Leyton-Brown et al., 2000) used in our experiments are arbitrary, matching, paths, regions, scheduling
and Legacy (L1-L8). Their instances are generated by the program at http://www.cs.ubc.ca/~kevinlb/CATS/.
Detailed descriptions of these problems can be found in (Leyton-Brown et al., 2000). Each problem
instance is a set of bids. Our task is to check the tree convexity of the bids. The results are listed in Table 2, where
each time entry is for 50 instances. From Table 2, the acyclicity based algorithm is 30 to 80 times faster than the
spanning tree based algorithm. It is worth mentioning that none of the instances in the benchmarks is tree convex,
which partially justifies our use of random problems that include both tree convex and non tree convex instances.

In summary, for both random problems and structured problems, the acyclicity based algorithm has a clear
performance advantage over the spanning tree based algorithm.



6 Conclusion 

Polynomial algorithms have been designed to test tree convexity using ideas from the consecutive ones property test
and from spanning trees. However, when the collection of sets is taken as a hypergraph, one can characterize tree
convexity by the acyclicity of the dual graph of the sets, which leads to a linear test algorithm thanks to the linear
algorithm for testing the acyclicity of hypergraphs. In addition to its theoretical worst case efficiency, the acyclicity
based algorithm is also very easy to implement and performs very well compared with the spanning tree based algorithm
on the random problems we have generated. We notice that the algorithms to test row convexity (i.e., the consecutive
ones property) are much more involved than the algorithm to test tree convexity, although efforts have been made to
find simpler algorithms (Habib et al., 2000; Meidanis et al., 1998). We are not aware of any work on the consecutive
ones property that employs the properties of hypergraphs. It would be interesting to investigate whether hypergraph
properties and algorithms can help produce efficient and simple consecutive ones property test algorithms.

Figure 5: Performance of the algorithms on problems <50, 100, 2, r2> with r2 varying from 20 to 90 with a step of 10.



Instance         Acyclicity based    Spanning tree based
arbitrary        0.58                34.67
arbitrary-npv    0.59                34.01
arbitrary-upv    0.59                34.91
matching         0.18                6.14
paths            0.29                16.27
regions          0.61                35.19
regions-npv      0.62                33.68
regions-upv      0.63                35.37
scheduling       0.17                42.38
L1               2.57                159.84
L2               4.04                324.02
L3               0.17                8.59
L4               0.16                6.95
L5               0.22                13.76
L6               0.29                18.43
L7               1.61                84.95
L8               0.62                8.8

Table 2: Performance of the algorithms on the benchmarking problems in (Leyton-Brown et al., 2000)